IMPACT-97-02: Run-time Spatial Locality Detection and Optimization
Authors
Abstract
As the disparity between processor and main memory performance grows, the number of execution cycles spent waiting for memory accesses to complete also increases. As a result, latency hiding techniques are critical for improved application performance on future processors. In this paper we examine the spatial locality characteristics of several applications, and show that spatial locality varies substantially across and within applications. We then present a microarchitecture scheme which detects and adapts to this varying spatial locality, dynamically adjusting the amount of data fetched on a cache miss. The Spatial Locality Detection Table, introduced in this paper, facilitates the detection of spatial locality across adjacent small cached blocks. Results from detailed simulations of several integer programs show significant speedups. The improvements are due to the reduction of conflict and capacity misses by utilizing small blocks and small fetch sizes when spatial locality is absent, and the prefetching effect of large fetch sizes when spatial locality exists.
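The trade-off described in the abstract can be illustrated with a toy simulation. This is a minimal sketch and not the paper's Spatial Locality Detection Table: it assumes a fully associative LRU cache of small blocks, and the names `count_misses`, `fetch_blocks`, `block_size`, and `capacity_blocks`, as well as both access traces, are hypothetical choices made only for illustration. It compares a small fetch size with a large one on a trace with spatial locality and on a trace without it.

```python
from collections import OrderedDict


def count_misses(addresses, fetch_blocks, block_size=8, capacity_blocks=64):
    """Count misses in a fully associative LRU cache of small blocks.

    On a miss, the aligned group of `fetch_blocks` adjacent blocks that
    contains the missed block is brought into the cache.
    All parameters are illustrative assumptions, not values from the paper.
    """
    cache = OrderedDict()  # block number -> None, oldest entry first
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
            continue
        misses += 1
        first = (block // fetch_blocks) * fetch_blocks
        for b in range(first, first + fetch_blocks):
            cache[b] = None
            cache.move_to_end(b)
        cache.move_to_end(block)  # the requested block is the most recent
        while len(cache) > capacity_blocks:
            cache.popitem(last=False)  # evict the least recently used block
    return misses


# Trace with spatial locality: one sequential sweep over 4 KiB.
sequential = list(range(0, 4096, 4))

# Trace without spatial locality: 48 addresses 512 bytes apart, reused 10 times.
sparse = [i * 512 for i in range(48)] * 10

for name, trace in (("sequential", sequential), ("sparse", sparse)):
    small = count_misses(trace, fetch_blocks=1)
    large = count_misses(trace, fetch_blocks=8)
    print(f"{name}: fetch 1 block -> {small} misses, fetch 8 blocks -> {large} misses")
```

Under these assumptions the sequential trace drops from 512 misses to 64 with the larger fetch size, while the sparse trace rises from 48 misses to 480 because the unused neighbouring blocks push the reused blocks out of the cache. No single fetch size is best for both traces, which is the motivation for choosing the fetch size at run time.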
Similar resources
ARS: an adaptive runtime system for locality optimization
Shared memory programs running on Non-Uniform Memory Access (NUMA) machines usually face inherent performance problems stemming from excessive remote memory accesses. A solution, called the Adaptive Runtime System (ARS), is presented in this paper. ARS is designed to adjust the data distribution at runtime through automatic page migrations. It uses memory access histograms gathered by hardware ...
Intelligent Methods for File System Optimization
The speed of I/O components is a major limitation of the speed of all other major components in today's computer systems. Motivated by this, we investigated several algorithms for efficient and intelligent organization of files on a hard disk. Total access time may be decreased if files with temporal locality also have spatial locality. Three intelligent methods based on file type, frequency, and trans...
Phase Locality Detection Using a Branch Trace Buffer for Efficient Profiling in Dynamic Optimization
Efficient profiling is a major challenge for dynamic optimization because the profiling overhead contributes to the total execution time. In order to identify program hot spots for runtime optimization, the profiler in a dynamic optimizer must detect new execution phases and subsequent phase changes. Current profiling approaches used in prototype dynamic optimizers are interpretation or instrum...
Optimized Thread Creation for Processor Multithreading
Due to the mismatch in the speed of the processor and the speed of the memory subsystem, modern processors spend a significant portion (often more than 50%) of their execution time stalling on cache misses. Processor multithreading is an approach that can reduce this stall time; however, processor multithreading increases the cache miss rate and demands higher memory bandwidth. In this paper, a n...
A memory-layout oriented run-time technique for locality optimization
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout oriented approach to exploit cache locality for parallel loops at run-time on Symmetric Multi-Processor (SMP) systems. Guided by application-dependent hints and the targeted cache architecture, it reorganizes and partit...
Publication date: 1997